Skip to content

Conversation

@dangotbanned
Copy link
Member

@dangotbanned dangotbanned commented Apr 7, 2025

Will close #2344, #2345

What type of PR is this? (check all applicable)

  • πŸ’Ύ Refactor
  • ✨ Feature
  • πŸ› Bug Fix
  • πŸ”§ Optimization
  • πŸ“ Documentation
  • βœ… Test
  • 🐳 Other

Related issues

Checklist

  • Code follows style guide (ruff)
  • Tests added
  • Documented the changes

If you have comments or can explain your changes, please do so below

  • Update main
    • TypeAliass and @overloads
    • DataFrame._compliant_frame
    • LazyFrame._compliant_frame
  • Update v1
    • LazyFrame overload
    • DataFrame overload

- Revealed two new `[overload-cannot-match]` from `mypy`
- I agreed with that and removed the conflict sources

Will close #2345
Comment on lines 415 to 435
def test_dataframe_recursive_v1() -> None:
pytest.importorskip("polars")
import polars as pl

pl_frame = pl.DataFrame({"a": [1, 2, 3]})
nw_frame = nw.from_native(pl_frame)
with pytest.raises(AssertionError):
nw.DataFrame(nw_frame, level="full")

nw_frame_early_return = nw.from_native(nw_frame)

if TYPE_CHECKING:
assert_type(pl_frame, pl.DataFrame)
# TODO @dangotbanned: Fix without breaking something else (1)
assert_type(nw_frame, nw.DataFrame[pl.DataFrame])

nw_frame_depth_2 = nw.DataFrame(nw_frame, level="full")
# NOTE: Checking that the type is `DataFrame[Unknown]`
assert_type(nw_frame_depth_2, nw.DataFrame)
# TODO @dangotbanned: Fix without breaking something else (2)
assert_type(nw_frame_early_return, nw.DataFrame[pl.DataFrame])
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Spent a few hours trying to get this working
  • Nothing I've tried worked for both mypy, pyright and didn't break something elsewhere

Copy link
Member Author

@dangotbanned dangotbanned Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, v1s overloads were always returning a Union for the case I was trying to force into DataFrame.

Gave up on trying to resolve in (c4bed59)

@@ -1,5 +1,6 @@
from __future__ import annotations

# mypy: disallow-any-generics=false, disable-error-code="var-annotated"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed to add this to avoid mypy requiring # type: ignores.

If a line has one (regardless of it is has an error code) pyright won't ever emit a diagnostic

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved into a docstring for a better explainer (6a66779)

image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FBruzzesi in relation to (#2304 (comment))

I think there's quite a lot of things to be learned - by following this PR all the way back to the original issue (#2239) πŸ™‚

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the ping! I just went through the entire PR, my skill issue is that everything makes sense, but most likely I would not be able to (re)produce it πŸ₯²

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FBruzzesi so if you go through the PR (or (https://github.com/narwhals-dev/narwhals/pull/2347/commits)) in order of commits - there's a pattern in all of this.

Is this a workflow?

  1. Identify some typing that doesn't meet your expectation ([Bug]: TypeVar(s) used in nw.(BaseFrame|Series) are recursiveΒ #2239)
  2. Make the smallest change you think could fix it (74a03a8)
  3. Write a test that would demonstrate the bug was fixed (ff841d1)
    • I probably should've started with that, but I was eager I guess πŸ˜‰
  4. Add some notes when that didn't work (ff841d1#r2029276670)
  5. Add even more notes when all seems lost (37549cc)
  6. Try absolutely anything to fix it (6bd47b4)
    • I wasn't happy with this initially, but became more convinced when it didn't break anything else
    • Without the tests in place - I might have not even tried this
  7. Write an essay on why you think that worked (6bd47b4#r2029455047)
  8. Clean things up (50e4d13)

Back to this PR

You can follow through (aa6578e) and Next > to see a similar thing πŸ™‚

Different issues came up, but I'm mostly trying to isolate small things and see what mypy & pyright spits back out when I attempt a fix

Copy link
Member Author

@dangotbanned dangotbanned Apr 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also usually try to get the typing right first before I start making "real" code changes.

It is much harder to spot the source of an issue if you only check the types after making lots of changes

@dangotbanned dangotbanned marked this pull request as ready for review April 7, 2025 21:49
Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @dangotbanned ! just got a question

Comment on lines 417 to 420
index: list[str] | None,
values: list[str] | None,
index: str | Sequence[str] | None,
values: str | Sequence[str] | None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do these change? is Sequence[str] | None not fine?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @dangotbanned ! just got a question

Thanks @MarcoGorelli, happy to answer πŸ˜…

why do these change? is Sequence[str] | None not fine?

the original annotation was overly narrow, since polars accepts much more than a list

I followed through to pandas to see if that was the source of using list[str], but IIRC it used an alias like IndexLabel.

Sequence[str] | None is equivalent to str | Sequence[str] | None sadly (python/typing#256).
But I prefer being more explicit that it can accept "one or more columns"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it's str we already make it a list at the narwhals level, can this just be Sequence[str] | None?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarcoGorelli sure thing!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved in (d6cb16b)

@dangotbanned
Copy link
Member Author

@MarcoGorelli I've resolved the conflicts from #2380

Are we good to go on this one?

@dangotbanned dangotbanned requested a review from FBruzzesi April 14, 2025 17:45
Comment on lines +79 to +83
class DataFrameGroupBy(
CompliantGroupBy[CompliantDataFrameT_co, CompliantExprT_contra],
Protocol38[CompliantDataFrameT_co, CompliantExprT_contra],
):
def __iter__(self) -> Iterator[tuple[Any, CompliantDataFrameT_co]]: ...
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FBruzzesi I forgot to mention this will probably be helpful in getting the typing working for (#2325).

It means you can use CompliantGroupBy for Polars* and put the unrelated parts here.

I needed to add it to resolve an unrelated issue (1815752)

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @dangotbanned !

sorry for the delay

@MarcoGorelli MarcoGorelli merged commit 24bdf86 into main Apr 15, 2025
28 of 29 checks passed
@MarcoGorelli MarcoGorelli deleted the narrow-type-var-frame branch April 15, 2025 19:33
@dangotbanned
Copy link
Member Author

thanks @dangotbanned !

sorry for the delay

68747470733a2f2f692e666c756666792e63632f5872316a6e4233723664577266777150774e6472706e46397a57316c567030462e676966

No worries @MarcoGorelli!

I'm just happy we're now in a good place to work on

😊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Narrow TypeVar used in LazyFrame Narrow TypeVar used in DataFrame

4 participants